Abstract. This paper formalizes the optimal base problem, presents
an algorithm to solve it, and describes its application to the encoding
of Pseudo-Boolean constraints to SAT. We demonstrate the impact of
integrating our algorithm within the Pseudo-Boolean constraint solver
MiniSat+. Experimentation indicates that our algorithm scales to bases
involving numbers up to 1,000,000, improving on the restriction in
MiniSat+ to prime numbers up to 17. We show that, while for many
examples primes up to 17 do suffice, encoding with respect to optimal
bases reduces the CNF sizes and improves the subsequent SAT solving
time for many examples.



1 Introduction 



The optimal base problem is all about finding an efficient representation for a
given collection of positive integers. One measure for the efficiency of such a
representation is the sum of the digits of the numbers. Consider for example
the decimal numbers $S = \{16, 30, 54, 60\}$. The sum of their digits is 25. Taking
binary representation we have $S_{(2)} = \{10000, 11110, 110110, 111100\}$ and the
sum of digits is 13, which is smaller. Taking ternary representation gives
$S_{(3)} = \{121, 1010, 2000, 2020\}$ with an even smaller sum of digits, 12.
Considering the mixed radix base $B = (3,5,2,2)$, the numbers are represented as
$S_{(B)} = \{101, 1000, 1130, 10000\}$ and the sum of the digits is 9. The optimal
base problem is to find a (possibly mixed radix) base for a given sequence of
numbers to minimize the size of the representation of the numbers. When
measuring size as "sum of digits", the base $B$ is indeed optimal for the numbers
of $S$. In this paper we present the optimal base problem and illustrate why it
is relevant to the encoding of Pseudo-Boolean constraints to SAT. We also
present an algorithm and show that our implementation is superior to current
implementations.

Pseudo-Boolean constraints take the form $a_1x_1 + a_2x_2 + \cdots + a_nx_n \geq k$, where
$a_1, \ldots, a_n$ are integer coefficients, $x_1, \ldots, x_n$ are Boolean literals (i.e., Boolean
variables or their negation), and $k$ is an integer. We assume that constraints are
in Pseudo-Boolean normal form [3], that is, the coefficients $a_i$ and $k$ are always
positive and Boolean variables occur at most once in $a_1x_1 + a_2x_2 + \cdots + a_nx_n$.
Pseudo-Boolean constraints are well studied and arise in many different contexts,
for example in verification [6] and in operations research [5]. Typically we are 
interested in the satisfiability of a conjunction of Pseudo-Boolean constraints. 
Since 2005 there has been a series of Pseudo-Boolean Evaluations [10] which aim to
assess the state of the art in the field of Pseudo-Boolean solvers. We adopt these 
competition problems as a benchmark for the techniques proposed in this paper. 

Pseudo-Boolean constraint satisfaction problems are often reduced to SAT. 
Many works describe techniques to encode these constraints to propositional
formulas [1,2,8]. The Pseudo-Boolean solver MiniSat+ ([8], cf. http://minisat.se)
chooses between three techniques to generate SAT encodings for Pseudo-Boolean
constraints. These convert the constraint to: (a) a BDD structure, (b) a
network of sorters, and (c) a network of (binary) adders. The network of adders 
is the most concise encoding, but it has the weakest propagation properties and 
often leads to higher SAT solving times than the BDD based encoding, which, on 
the other hand, generates the largest encoding. The encoding based on sorting 
networks is often the one applied and is the one we consider in this paper. 

To demonstrate how sorters can be used to translate Pseudo-Boolean constraints,
consider the constraint $\psi = x_1 + x_2 + x_3 + 2x_4 + 3x_5 \geq 4$ where the sum of the
coefficients is 8. On the right, we illustrate an $8 \times 8$ sorter where $x_1, x_2, x_3$
are each fed into a single input, $x_4$ into two of the inputs, and $x_5$ into three
of the inputs. The outputs are $y_1, \ldots, y_8$. First, we represent the sorting
network as a Boolean formula, $\varphi$, which in general, for $n$ inputs, will be of
size $O(n \log^2 n)$ [4]. Then, to assert $\psi$ we take the conjunction of $\varphi$ with the
formula $y_1 \wedge y_2 \wedge y_3 \wedge y_4$.
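The single-sorter idea can be checked numerically. The following is a minimal sketch, not the MiniSat+ encoding itself: it replaces the sorting network by an idealised sorter (Python's `sorted`) and confirms that asserting the first four outputs agrees with the constraint on all 32 assignments. The function names are hypothetical.

```python
from itertools import product

def sorter_outputs(bits):
    """Idealised sorter: returns the input bits sorted with all 1s first."""
    return sorted(bits, reverse=True)

def holds_via_sorter(x, coeffs, k):
    """Feed x[i] into coeffs[i] sorter inputs and assert outputs y_1..y_k."""
    inputs = [xi for xi, a in zip(x, coeffs) for _ in range(a)]
    y = sorter_outputs(inputs)
    return len(y) >= k and all(y[:k])

def holds_directly(x, coeffs, k):
    """Reference semantics of the Pseudo-Boolean constraint."""
    return sum(a * xi for a, xi in zip(coeffs, x)) >= k

coeffs, k = [1, 1, 1, 2, 3], 4   # x1 + x2 + x3 + 2*x4 + 3*x5 >= 4
agree = all(
    holds_via_sorter(x, coeffs, k) == holds_directly(x, coeffs, k)
    for x in product([0, 1], repeat=len(coeffs))
)
```

Because the outputs are sorted, asserting $y_1 \wedge \cdots \wedge y_4$ amounts to asserting $y_4$ alone; the conjunction is kept here to mirror the text.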

But what happens if the coefficients in a constraint are larger than in this 
example? How should we encode $16x_1 + 30x_2 + 54x_3 + 30x_4 + 60x_5 \geq 87$? How
should we handle very large coefficients (larger than 1,000,000)? To this end, the 
authors in [8] generalize the above idea and propose to decompose the constraint 
into a number of interconnected sorting networks. Each sorter represents a digit 
in a mixed radix base. This construction is governed by the choice of a suitable 
mixed radix base and the objective is to find a base which minimizes the size of 
the sorting networks. Here the optimal base problem comes in, as the size of the 
networks is directly related to the size of the representation of the coefficients. 
We consider the sum of the digits (of coefficients) and other measures for the size 
of the representations and study their influence on the quality of the encoding. 

In MiniSat+ the search for an optimal base is performed using a brute force
algorithm and the resulting base is constructed from prime numbers up to 17. 
The starting point for this paper is the following remark from [8] (Footnote 8): 

This is an ad-hoc solution that should be improved in the future. Finding 
the optimal base is a challenging optimization problem in its own right. 

In this paper we take the challenge and present an algorithm which scales to find 
an optimal base consisting of elements with values up to 1,000,000. We illustrate 
that in many cases finding a better base leads also to better SAT solving time. 



Section 2 provides preliminary definitions and formalizes the optimal base 
problem. Section 3 describes how MiniSat+ decomposes a Pseudo-Boolean constraint
with respect to a given mixed radix base to generate a corresponding
propositional encoding, so that the constraint has a solution precisely when 
the encoding has a model. Section 4 is about (three) alternative measures with 
respect to which an optimal base can be found. Sections 5-7 introduce our algo- 
rithm based on classic AI search methods (such as cost underapproximation) in 
three steps: Heuristic pruning, best-first branch and bound, and base abstrac- 
tion. Sections 8 and 9 present an experimental evaluation and some related work. 
Section 10 concludes. Proofs are given in the appendix. 

2 Optimal Base Problems 

In the classic base $r$ radix system, positive integers are represented as finite
sequences of digits $d = (d_0, \ldots, d_k)$ where for each digit $0 \leq d_i < r$, and for
the most significant digit, $d_k > 0$. The integer value associated with $d$ is
$v = d_0 + d_1 r + d_2 r^2 + \cdots + d_k r^k$. A mixed radix system is a generalization where a
base is an infinite radix sequence $B = (r_0, r_1, r_2, \ldots)$ of integers where for each
radix, $r_i > 1$ and for each digit, $0 \leq d_i < r_i$. The integer value associated with $d$
is $v = d_0 w_0 + d_1 w_1 + d_2 w_2 + \cdots + d_k w_k$ where $w_0 = 1$ and for $i \geq 0$, $w_{i+1} = w_i r_i$.
The sequence $weights(B) = (w_0, w_1, w_2, \ldots)$ specifies the weighted contribution
of each digit position and is called the weight sequence of $B$. A finite mixed radix
base is a finite sequence $B = (r_0, r_1, \ldots, r_{k-1})$ with the same restrictions as for
the infinite case except that numbers always have $k + 1$ digits (possibly padded
with zeroes) and there is no bound on the value of the most significant digit, $d_k$.

In this paper we focus on the representation of finite multisets of natural 
numbers in finite mixed radix bases. Let Base denote the set of finite mixed 
radix bases and ms(N) the set of finite non-empty multisets of natural numbers. 
We often view multisets as ordered (and hence refer to their first element, second 
element, etc.). For a finite sequence or multiset $S$ of natural numbers, we denote
its length by $|S|$, its maximal element by $max(S)$, its $i$-th element by $S(i)$, and the
product of its elements by $\prod S$ (if $S$ is the empty sequence then $\prod S = 1$).
If a base consists of prime numbers only, then we say that it is a prime base.
The set of prime bases is denoted $Base_p$.

Let $B \in Base$ with $|B| = k$. We denote by $v_{(B)} = (d_0, d_1, \ldots, d_k)$ the
representation of a natural number $v$ in base $B$. The most significant digit of
$v_{(B)}$, denoted $msd(v_{(B)})$, is $d_k$. If $msd(v_{(B)}) = 0$ then we say that $B$ is
redundant for $v$. Let $S \in ms(\mathbb{N})$ with $|S| = n$. We denote the $n \times (k + 1)$ matrix
of digits of elements from $S$ in base $B$ as $S_{(B)}$. Namely, the $i$-th row in $S_{(B)}$ is
the vector $S(i)_{(B)}$. The most significant digit column of $S_{(B)}$ is the $k + 1$ column
of the matrix and denoted $msd(S_{(B)})$. If $msd(S_{(B)}) = (0, \ldots, 0)^T$, then we say that
$B$ is redundant for $S$. This is equivalently characterized by $\prod B > max(S)$.

Definition 1 (non-redundant bases). Let $S \in ms(\mathbb{N})$. We denote the set
of non-redundant bases for $S$, $Base(S) = \{ B \in Base \mid \prod B \leq max(S) \}$. The
set of non-redundant prime bases for $S$ is denoted $Base_p(S)$. The set of
non-redundant (prime) bases for $S$, containing elements no larger than $\ell$, is
denoted $Base^{\ell}(S)$ ($Base_p^{\ell}(S)$). The set of bases in $Base(S)$/$Base_p(S)$/$Base_p^{\ell}(S)$ is
often viewed as a tree with root ( ) (the empty base) and an edge from $B$ to $B'$ if
and only if $B'$ is obtained from $B$ by extending it with a single integer value.

Definition 2 (sum_digits). Let $S \in ms(\mathbb{N})$ and $B \in Base$. The sum of the
digits of the numbers from $S$ in base $B$ is denoted $\mathit{sum\_digits}(S_{(B)})$.

Example 3. The usual binary "base 2" and ternary "base 3" are represented as
the infinite sequences $B_1 = (2, 2, 2, \ldots)$ and $B_2 = (3, 3, 3, \ldots)$. The finite sequence
$B_3 = (3, 5, 2, 2)$ and the empty sequence $B_4 = ()$ are also bases. The empty base
is often called the "unary base" (every number in this base has a single digit).
Let $S = \{16, 30, 54, 60\}$. Then, $\mathit{sum\_digits}(S_{(B_1)}) = 13$, $\mathit{sum\_digits}(S_{(B_2)}) = 12$,
$\mathit{sum\_digits}(S_{(B_3)}) = 9$, and $\mathit{sum\_digits}(S_{(B_4)}) = 160$.
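The values in Example 3 can be reproduced with a few lines of Python. This is an illustrative sketch under the definitions of this section; the helper names `to_mixed_radix` and `sum_digits` are chosen here, and the infinite binary and ternary bases are approximated by finite prefixes long enough to cover max(S) = 60.

```python
def to_mixed_radix(v, base):
    """Digits of v in a finite mixed radix base, least significant first.

    The result has len(base) + 1 digits; the last (most significant)
    digit is unbounded, as in the definition of finite bases.
    """
    ds = []
    for r in base:
        v, d = divmod(v, r)
        ds.append(d)
    ds.append(v)  # most significant digit
    return ds

def sum_digits(S, base):
    """Sum of all digits of the numbers in S when written in the given base."""
    return sum(sum(to_mixed_radix(v, base)) for v in S)

S = [16, 30, 54, 60]
binary = (2,) * 6    # finite prefix of (2, 2, 2, ...); 2**6 > 60
ternary = (3,) * 4   # finite prefix of (3, 3, 3, ...); 3**4 > 60
results = {
    "binary": sum_digits(S, binary),
    "ternary": sum_digits(S, ternary),
    "(3,5,2,2)": sum_digits(S, (3, 5, 2, 2)),
    "unary": sum_digits(S, ()),
}
```

Note that the digits are stored least significant first, so 54 in base (3, 5, 2, 2) is the list [0, 3, 1, 1, 0], which reads as 1130 when written most significant digit first.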

Let $S \in ms(\mathbb{N})$. A cost function for $S$ is a function $cost_S : Base \to \mathbb{R}$ which
associates bases with real numbers. An example is $cost_S(B) = \mathit{sum\_digits}(S_{(B)})$.
In this paper we are concerned with the following optimal base problem.

Definition 4 (optimal base problem). Let $S \in ms(\mathbb{N})$ and $cost_S$ a cost
function. We say that a base $B$ is an optimal base for $S$ with respect to $cost_S$, if
for all bases $B'$, $cost_S(B) \leq cost_S(B')$. The corresponding optimal base problem
is to find an optimal base $B$ for $S$.

The following two lemmata confirm that for the sum_digits cost function, we
may restrict attention to non-redundant bases involving prime numbers only.

Lemma 5. Let $S \in ms(\mathbb{N})$ and consider the sum_digits cost function. Then, $S$
has an optimal base in $Base(S)$.

Lemma 6. Let $S \in ms(\mathbb{N})$ and consider the sum_digits cost function. Then, $S$
has an optimal base in $Base_p(S)$.

How hard is it to solve an instance of the optimal base problem (namely,
for $S \in ms(\mathbb{N})$)? The following lemma provides a polynomial (in $max(S)$) upper
bound on the size of the search space. This in turn suggests a pseudo-polynomial
time brute force algorithm (to traverse the search space).

Lemma 7. Let $S \in ms(\mathbb{N})$ with $m = max(S)$. Then, $|Base(S)| \leq m^{1+\rho}$ where
$\rho = \zeta^{-1}(2) \approx 1.73$ and where $\zeta$ is the Riemann zeta function.

Proof. Chor et al. prove in [7] that the number of ordered factorizations of a
natural number $n$ is less than $n^{\rho}$. The number of bases for all of the numbers in
$S$ is hence bounded by $\sum_{n \leq m} n^{\rho}$, which is bounded by $m^{1+\rho}$.
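To get a feeling for the size of the search space, the non-redundant bases can simply be counted. The sketch below is an illustration only: `count_bases` is a hypothetical helper that counts all sequences of integers greater than 1 whose product does not exceed m, and `RHO` is the rounded value 1.73 quoted in Lemma 7.

```python
RHO = 1.73  # rounded value of rho from Lemma 7

def count_bases(m, prefix_product=1):
    """Count bases B with prod(B) <= m that extend a prefix of given product.

    The prefix itself is counted, so count_bases(m) includes the empty base.
    """
    total = 1
    r = 2
    while prefix_product * r <= m:
        total += count_bases(m, prefix_product * r)
        r += 1
    return total

# For m = 6 the bases are (), (2), (3), (4), (5), (6), (2,2), (2,3), (3,2).
small = count_bases(6)
within_bound = all(count_bases(m) <= m ** (1 + RHO) for m in range(1, 101))
```

For small m the actual count is far below the bound, which is consistent with the bound being a worst-case estimate.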

3 Encoding Pseudo-Boolean Constraints 

This section presents the construction underlying the sorter based encoding of
Pseudo-Boolean constraints applied in MiniSat+ [8]. It is governed by the choice
of a mixed radix base $B$, the optimal selection of which is the topic of this paper.
The construction sets up a series of sorting networks to encode the digits, in base
$B$, of the sum of the terms on the left side of a constraint
$\psi = a_1x_1 + a_2x_2 + \cdots + a_nx_n \geq k$. The encoding then compares these digits with
those from $k_{(B)}$ from the right side. We present the construction, step by step,
through an example where $B = (2, 3, 3)$ and
$\psi = 2x_1 + 2x_2 + 2x_3 + 2x_4 + 5x_5 + 18x_6 \geq 23$.

Step one - representation in base: 

The coefficients of $\psi$ form a multiset $S = \{2, 2, 2, 2, 5, 18\}$
and their representation in base $B$, a $6 \times 4$ matrix, $S_{(B)}$, is
depicted on the right. The rows of the matrix correspond
to the representation of the coefficients in base $B$.

Step two - counting: Representing the coefficients as four digit numbers in
base $B = (2,3,3)$ and considering the values $weights(B) = (1,2,6,18)$ of the
digit positions, we obtain a decomposition for the left side of $\psi$:

$1 \cdot (x_5) + 2 \cdot (x_1 + x_2 + x_3 + x_4 + 2x_5) + 6 \cdot (0) + 18 \cdot (x_6)$

To encode the sums at each digit position (1, 2, 6, 18), we set up a series of four
sorting networks as depicted below. Given values for the variables, the sorted
outputs from these networks represent unary numbers $d_1, d_2, d_3, d_4$ such that
the left side of $\psi$ takes the value $1 \cdot d_1 + 2 \cdot d_2 + 6 \cdot d_3 + 18 \cdot d_4$.

Step three - converting to base: For the outputs $d_1, d_2, d_3, d_4$ to represent
the digits of a number in base $B = (2, 3, 3)$, we need to encode also the "carry"
operation from each digit position to the next. The first 3 outputs must
represent valid digits for $B$, i.e., unary numbers less than (2,3,3) respectively.
In our example the single potential violation to this restriction is $d_2$, which
is represented in 6 bits. To this end we add two components to the encoding:
(1) each third output of the second network ($y_3$ and $y_6$ in the diagram)
is fed into the third network as an additional (carry) input; and (2) clauses
are added to encode that the output of the second network is to be considered
modulo 3. We call these additional clauses a normalizer. The normalizer
defines two outputs $R = (r_1, r_2)$ and introduces clauses specifying that
the (unary) value of $R$ equals the (unary) value of $d_2$ mod 3.

Step four - comparison: The outputs from these four units now specify a
number in base $B$, each digit represented in unary notation. This number is now
compared (via an encoding of the lexicographic order) to $23_{(B)}$ (the value from
the right-hand side of $\psi$).
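The four steps can be traced arithmetically for the running example. The sketch below does not build any sorting networks or clauses; it only mimics what they compute (per-position counts, carries, normalization modulo the radix, and a lexicographic comparison) and checks the outcome against the constraint on all 64 assignments. The function names are hypothetical.

```python
from itertools import product

def digits(v, base):
    """Digits of v in a finite mixed radix base, least significant first."""
    ds = []
    for r in base:
        v, d = divmod(v, r)
        ds.append(d)
    ds.append(v)  # unbounded most significant digit
    return ds

def holds_via_base(x, coeffs, k, base):
    """Numerically mimic steps one to four of the construction."""
    rows = [digits(a, base) for a in coeffs]       # step one: S_(B)
    out, carry = [], 0
    for j in range(len(base) + 1):
        # step two: number of 1-inputs of the j-th sorter, carries included
        count = carry + sum(row[j] * xi for row, xi in zip(rows, x))
        if j < len(base):
            # step three: pass on the carry, keep the count modulo the radix
            carry, count = divmod(count, base[j])
        out.append(count)
    # step four: lexicographic comparison, most significant digit first
    return out[::-1] >= digits(k, base)[::-1]

coeffs, k, base = [2, 2, 2, 2, 5, 18], 23, (2, 3, 3)
agree = all(
    holds_via_base(x, coeffs, k, base)
    == (sum(a * xi for a, xi in zip(coeffs, x)) >= k)
    for x in product([0, 1], repeat=len(coeffs))
)
```

The lexicographic comparison is valid because, after normalization, every digit except the most significant one lies below its radix.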



4 Measures of Optimality 

We now return to the objective of this paper: For a given Pseudo-Boolean con- 
straint, how can we choose a mixed radix base with respect to which the encoding 
of the constraint via sorting networks will be optimal? We consider here three 
alternative cost functions with respect to which an optimal base can be found. 
These cost functions capture with increasing degree of precision the actual size 
of the encodings. 

The first cost function, sum_digits as introduced in Definition 2, provides a 
coarse measure on the size of the encoding. It approximates (from below) the 
total number of input bits in the network of sorting networks underlying the 
encoding. An advantage in using this cost function is that there always exists an 
optimal base which is prime. The disadvantage is that it ignores the carry bits 
in the construction, and as such is not always a precise measure for optimality. 
In [8], the authors propose to apply a cost function which considers also the 
carry bits. This is the second cost function we consider and we call it sum_carry.

Definition 8 (cost function: sum_carry). Let $S \in ms(\mathbb{N})$, $B \in Base$ with
$|B| = k$ and $S_{(B)} = (a_{ij})$ the corresponding $n \times (k + 1)$ matrix of digits. Denote
the sequences $s = (s_0, s_1, \ldots, s_k)$ (sums) and $c = (c_0, c_1, \ldots, c_k)$ (carries)
defined by: $s_j = \sum_{i=1}^{n} a_{ij}$ for $0 \leq j \leq k$, $c_0 = 0$, and
$c_{j+1} = (s_j + c_j) \;\mathrm{div}\; B(j)$ for $0 \leq j < k$. The "sum of digits with carry"
cost function is defined by:

$\mathit{sum\_carry}(S_{(B)}) = \sum_{j=0}^{k} (s_j + c_j)$

The following example illustrates the sum_carry cost function and that it 
provides a better measure of base optimality for the (size of the) encoding of 
Pseudo-Boolean constraints. 

Example 9. Consider the encoding of a Pseudo-Boolean constraint with
coefficients $S = \{1, 3, 4, 8, 18, 18\}$ with respect to bases: $B_1 = (2, 3, 3)$,
$B_2 = (3, 2, 3)$, and $B_3 = (2, 2, 2, 2)$. Figure 1 depicts the sizes of the sorting
networks for each of these bases. The upper tables illustrate the representation
of the coefficients in the corresponding bases. In the lower tables, the rows
labeled "sum" indicate the number of bits per network and (to their right) their
total sum which is the sum_digits cost. With respect to the sum_digits cost
function, all three bases are optimal for $S$, with a total of 9 inputs. The
algorithm might as well return $B_3$.
The rows labeled "carry" indicate the number of carry bits in each of the
constructions and (to their right) their totals. With respect to the sum_carry
cost function, bases $B_1$ and $B_2$ are optimal for $S$, with a total of 9 + 2 = 11 bits
while $B_3$ involves 9 + 5 = 14 bits. The algorithm might as well return $B_1$.
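The numbers in Example 9 follow directly from Definition 8. As a sketch (helper names are chosen here for illustration), the column sums and carries can be computed as follows.

```python
def digits(v, base):
    """Digits of v in a finite mixed radix base, least significant first."""
    ds = []
    for r in base:
        v, d = divmod(v, r)
        ds.append(d)
    ds.append(v)  # unbounded most significant digit
    return ds

def column_sums(S, base):
    """s_j: sum of the j-th digit over all numbers in S."""
    rows = [digits(v, base) for v in S]
    return [sum(col) for col in zip(*rows)]

def carries(sums, base):
    """c_0 = 0 and c_{j+1} = (s_j + c_j) div B(j)."""
    c = [0]
    for j, r in enumerate(base):
        c.append((sums[j] + c[j]) // r)
    return c

def sum_carry(S, base):
    """Total number of sorter inputs, carry inputs included."""
    s = column_sums(S, base)
    return sum(s) + sum(carries(s, base))

S = [1, 3, 4, 8, 18, 18]
B1, B2, B3 = (2, 3, 3), (3, 2, 3), (2, 2, 2, 2)
costs = [sum_carry(S, B) for B in (B1, B2, B3)]
```

For $B_3$ the carries are (0, 1, 2, 1, 1), which accounts for the 5 extra bits mentioned in the example.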

The following example shows that when searching for an optimal base with 
respect to the sum_carry cost function, one must consider also non-prime bases. 

Example 10. Consider again the Pseudo-Boolean constraint
$\psi = 2x_1 + 2x_2 + 2x_3 + 2x_4 + 5x_5 + 18x_6 \geq 23$ from Section 3. The encoding with
respect to $B_1 = (2,3,3)$ results in 4 sorting networks with 10 inputs from the
coefficients and 2 carries. So a total of 12 bits. The encoding with respect to
$B_2 = (2, 9)$ is smaller. It has the same 10 inputs from the coefficients but no
carry bits. Base $B_2$ is optimal and non-prime.

Fig. 1. Number of inputs/carries/comparators when encoding $S = \{1,3,4,8,18,18\}$
and three bases $B_1 = (2,3,3)$, $B_2 = (3,2,3)$, and $B_3 = (2,2,2,2)$.

We consider a third cost function which we call the num_comp cost function.
Sorting networks are constructed from "comparators" [9] and in the encoding
each comparator is modeled using six CNF clauses. This function counts the
number of comparators in the construction. Let $f(n)$ denote the number of
comparators in an $n \times n$ sorting network. For small values of $0 \leq n \leq 8$, $f(n)$
takes the values 0, 0, 1, 3, 5, 9, 12, 16 and 19 respectively which correspond to the
sizes of the optimal networks of these sizes [9]. For larger values, the construction
uses Batcher's odd-even sorting networks [4] for which
$f(n) = n \cdot \lceil \log_2 n \rceil \cdot (\lceil \log_2 n \rceil - 1)/4 + n - 1$.

Definition 11 (cost function: num_comp). Consider the same setting as in
Definition 8. Then,

$\mathit{num\_comp}(S_{(B)}) = \sum_{j=0}^{k} f(s_j + c_j)$

Example 12. Consider again the setting of Example 9. In Figure 1 the rows la- 
beled "comp" indicate the number of comparators in each of the sorting networks 
and their totals. The construction with the minimal number of comparators is 
that obtained with respect to the base B2 = (3, 2, 3) with 10 comparators. 
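The comparator count of Example 12 can be recomputed from Definition 11. The sketch below uses hypothetical helper names; for n > 8 it evaluates the closed formula exactly as stated in the text, which yields an integer when n is a power of two and may be fractional otherwise.

```python
from math import ceil, log2

SMALL_F = [0, 0, 1, 3, 5, 9, 12, 16, 19]  # f(0), ..., f(8) as listed in the text

def f(n):
    """Number of comparators attributed to an n x n sorting network."""
    if n <= 8:
        return SMALL_F[n]
    k = ceil(log2(n))
    return n * k * (k - 1) / 4 + n - 1  # formula from the text, unrounded

def digits(v, base):
    """Digits of v in a finite mixed radix base, least significant first."""
    ds = []
    for r in base:
        v, d = divmod(v, r)
        ds.append(d)
    ds.append(v)
    return ds

def network_sizes(S, base):
    """Sizes s_j + c_j of the sorting networks, one per digit position."""
    sums = [sum(col) for col in zip(*(digits(v, base) for v in S))]
    sizes, carry = [], 0
    for j, s in enumerate(sums):
        sizes.append(s + carry)
        if j < len(base):
            carry = (s + carry) // base[j]
    return sizes

def num_comp(S, base):
    return sum(f(n) for n in network_sizes(S, base))

S = [1, 3, 4, 8, 18, 18]
comp = {B: num_comp(S, B) for B in [(2, 3, 3), (3, 2, 3), (2, 2, 2, 2)]}
```

With these definitions the base (3, 2, 3) yields networks of sizes 4, 3, 2, 2 and hence 5 + 3 + 1 + 1 = 10 comparators, matching the example.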



It is interesting to remark the following relationship between the three cost
functions: The sum_digits function is the most "abstract". It is only based on
the representation of numbers in a mixed radix base. The sum_carry function
considers also properties of addition in mixed-radix bases (resulting in the carry
bits). Finally, the num_comp function considers also implementation details of
the odd-even sorting networks applied in the underlying MiniSat+ construction.
In Section 8 we evaluate how the alternative choices for a cost function influence 
the size and quality of the encodings obtained with respect to corresponding 
optimal bases. 



5 Optimal Base Search I: Heuristic Pruning 

This section introduces a simple, heuristic-based, depth-first, tree search algo- 
rithm to solve the optimal base problem. The search space is the domain of 
non-redundant bases as presented in Definition 1. The starting point is the brute 
force algorithm applied in MiniSat+. For a sequence of integers S, MiniSat+ 
applies a depth-first traversal of $Base_p^{17}(S)$ to find the base with the optimal
value for the cost function $cost_S(B) = \mathit{sum\_carry}(S_{(B)})$.

Our first contribution is to introduce a heuristic function and to identify 
branches in the search space which can be pruned early on in the search. Each 
tree node B encountered during the traversal is inspected to check if given the 
best node encountered so far, bestB, it is possible to determine that all de- 
scendants of B are guaranteed to be less optimal than bestB. In this case, the 
subtree rooted at B may be pruned. The resulting algorithm improves on the one 
of MiniSat+ and provides the basis for the further improvements introduced in 
Sections 6 and 7. We need first a definition. 

Definition 13 (base extension, partial cost, and admissible heuristic).

Let $S \in ms(\mathbb{N})$, $B, B' \in Base(S)$, and $cost_S$ a cost function. We say that: (1)
$B'$ extends $B$, denoted $B' \succ B$, if $B$ is a prefix of $B'$, (2) $\partial cost_S$ is a partial
cost function for $cost_S$ if $\forall B' \succ B.\ cost_S(B') \geq \partial cost_S(B)$, and (3) $h_S$ is an
admissible heuristic function for $cost_S$ and $\partial cost_S$ if
$\forall B' \succ B.\ cost_S(B') \geq \partial cost_S(B') + h_S(B') \geq \partial cost_S(B) + h_S(B)$.

The intuition is that $\partial cost_S(B)$ signifies a part of the cost of $B$ which will be a
part of the cost of any extension of $B$, and that $h_S(B)$ is an under-approximation
on the additional cost of extending $B$ (in any way) given the partial cost of $B$. We
denote $cost^{\alpha}_S(B) = \partial cost_S(B) + h_S(B)$. If $\partial cost_S$ is a partial cost function and
$h_S$ is an admissible heuristic function, then $cost^{\alpha}_S(B)$ is an under-approximation
of $cost_S(B)$. The next lemma provides the basis for heuristic pruning using the
three cost functions introduced above.

Lemma 14. The following are admissible heuristics for the cases when:

1. $cost_S(B) = \mathit{sum\_digits}(S_{(B)})$: $\partial cost_S(B) = cost_S(B) - \sum msd(S_{(B)})$.

2. $cost_S(B) = \mathit{sum\_carry}(S_{(B)})$: $\partial cost_S(B) = cost_S(B) - \sum msd(S_{(B)})$.

3. $cost_S(B) = \mathit{num\_comp}(S_{(B)})$: $\partial cost_S(B) = cost_S(B) - f(s_{|B|} + c_{|B|})$.

In the first two settings we take $h_S(B) = |\{ x \in S \mid x \geq \prod B \}|$.

In the case of num_comp we take the trivial heuristic estimate $h_S(B) = 0$.
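For the sum_digits cost function, the partial cost and heuristic of Lemma 14 are easy to compute and to inspect along a branch of the search tree. The following sketch (with hypothetical function names) evaluates the under-approximation on the chain of prefixes of the base (3, 5, 2, 2) for S = {16, 30, 54, 60}.

```python
from math import prod

def digits(v, base):
    """Digits of v in a finite mixed radix base, least significant first."""
    ds = []
    for r in base:
        v, d = divmod(v, r)
        ds.append(d)
    ds.append(v)  # unbounded most significant digit
    return ds

def sum_digits(S, base):
    return sum(sum(digits(v, base)) for v in S)

def partial_cost(S, base):
    """sum_digits cost without the most significant digit column."""
    return sum(sum(digits(v, base)[:-1]) for v in S)

def heuristic(S, base):
    """Each x >= prod(base) keeps a non-zero leading part in any extension."""
    return sum(1 for x in S if x >= prod(base))

def cost_alpha(S, base):
    return partial_cost(S, base) + heuristic(S, base)

S = [16, 30, 54, 60]
chain = [(), (3,), (3, 5), (3, 5, 2), (3, 5, 2, 2)]
alphas = [cost_alpha(S, B) for B in chain]
costs = [sum_digits(S, B) for B in chain]
```

Along this chain the under-approximation never decreases and never exceeds the true cost of the final base, which is the behaviour that the pruning relies on.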

The algorithm, which we call dfsHP for depth-first search with heuristic
pruning, is now stated as Figure 2 where the input to the algorithm is a multiset
of integers $S$ and the output is an optimal base. The algorithm applies a
depth-first traversal of $Base(S)$ in search of an optimal base. We assume given: a
cost function $cost_S$, a partial cost function $\partial cost_S$ and an admissible heuristic
$h_S$. We denote $cost^{\alpha}_S(B) = \partial cost_S(B) + h_S(B)$. The abstract data type base has
two operations: extend(int) and extenders(multiset). For a base $B$ and an
integer $p$, B.extend(p) is the base obtained by extending $B$ by $p$. For a multiset
$S$, B.extenders(S) is the set of integer values $p$ by which $B$ can be extended
to a non-redundant base for $S$, i.e., such that $\prod$ B.extend(p) $\leq max(S)$. The
definition of this operation may have additional arguments to indicate if we seek
a prime base or one containing elements no larger than $\ell$.

/*input*/   multiset S

/*init*/    base bestB = (2,2,...,2)

/*dfs*/     depth-first traverse Base(S)

            at each node B, for the next value p ∈ B.extenders(S) do
                base newB = B.extend(p)
                if (cost^α_S(newB) > cost_S(bestB)) prune
                else if (cost_S(newB) < cost_S(bestB)) bestB = newB

/*output*/  return bestB;

Fig. 2. dfsHP: depth-first search for an optimal base with heuristic pruning

Initialization (/*init*/ in the figure) assigns to the variable bestB a finite
binary base of size $\lfloor \log_2(max(S)) \rfloor$. This variable will always denote the best base
encountered so far (or the initial finite binary base). Throughout the traversal,
when visiting a node newB we first check if the subtree rooted at newB should be
pruned. If this is not the case, then we check if a better "best base so far" has
been found. Once the entire (with pruning) search space has been considered,
the optimal base is in bestB.
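A compact Python rendering of the depth-first search with heuristic pruning is given below for the sum_digits cost function. It is a sketch under the definitions above and not the MiniSat+ or authors' implementation: extenders are all integers p with $\prod B \cdot p \leq max(S)$ (no restriction to primes), and the names `dfs_hp`, `cost_alpha`, and `digits` are hypothetical.

```python
from math import prod

def digits(v, base):
    """Digits of v in a finite mixed radix base, least significant first."""
    ds = []
    for r in base:
        v, d = divmod(v, r)
        ds.append(d)
    ds.append(v)
    return ds

def sum_digits(S, base):
    return sum(sum(digits(v, base)) for v in S)

def cost_alpha(S, base):
    """Partial cost (all but the msd column) plus the heuristic of Lemma 14."""
    partial = sum(sum(digits(v, base)[:-1]) for v in S)
    return partial + sum(1 for x in S if x >= prod(base))

def dfs_hp(S):
    m = max(S)
    best = (2,) * (m.bit_length() - 1)       # binary base of size floor(log2 m)
    best_cost = sum_digits(S, best)

    def visit(B):
        nonlocal best, best_cost
        for p in range(2, m // prod(B) + 1):  # extenders: prod(B) * p <= m
            new_b = B + (p,)
            if cost_alpha(S, new_b) > best_cost:
                continue                      # prune the subtree below new_b
            c = sum_digits(S, new_b)
            if c < best_cost:
                best, best_cost = new_b, c
            visit(new_b)

    visit(())
    return best, best_cost

best, best_cost = dfs_hp([16, 30, 54, 60])
```

On the introductory example the search returns a base of cost 9, the value the text reports as optimal; which of several equally good bases is returned depends on the traversal order.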

To establish a bound on the complexity of the algorithm, denote the number
of different integers in $S$ by $s$ and $m = max(S)$. The algorithm has space
complexity $O(\log(m))$, for the depth first search on a tree with height bound by
$\log(m)$ (an element of $Base(S)$ will have at most $\log_2(m)$ elements). For each
base considered during the traversal, we have to calculate $cost_S$ which incurs
a cost of $O(s)$. To see why, consider that when extending a base $B$ by a new
element giving base $B'$, the first columns of $S_{(B')}$ are the same as those in $S_{(B)}$
(and thus also the costs incurred by them). Only the cost incurred by the most
significant digit column of $S_{(B)}$ needs to be recomputed for $S_{(B')}$ due to base
extension of $B$ to $B'$. Performing the computation for this column, we compute a
new digit for the $s$ different values in $S$. Finally, by Lemma 7, there are $O(m^{2.73})$
bases and therefore, the total runtime is $O(s \cdot m^{2.73})$. Given that $s \leq m$, we can
conclude that runtime is bounded by $O(m^{3.73})$.



6 Optimal Base Search II: Branch and Bound 

In this section we further improve the search algorithm for an optimal base. The 
search algorithm is, as before, a traversal of the search space using the same 
partial cost and heuristic functions as before to prune the tree. The difference 
is that instead of a depth first search, we maintain a priority queue of nodes for 
expansion and apply a best-first, branch and bound search strategy.

Figure 3 illustrates our enhanced search algorithm. We call it B&B. The
abstract data type priority_queue maintains bases prioritized by the value of
$cost^{\alpha}_S$. Operations popMin(), push(base) and peek() (peeks at the minimal
entry) are the usual. The reason to box the text "priority_queue" in the figure
will become apparent in the next section.

base findBase(multiset S)

/*1*/ base bestB = (2,2,...,2); priority_queue Q = { ( ) };

/*2*/ while (Q ≠ {} && cost^α_S(Q.peek()) < cost_S(bestB))

/*3*/     base B = Q.popMin();

/*4*/     foreach (p ∈ B.extenders(S))

/*5*/         base newB = B.extend(p);

/*6*/         if (cost^α_S(newB) ≤ cost_S(bestB))

/*7*/         { Q.push(newB); if (cost_S(newB) < cost_S(bestB)) bestB = newB; }

/*8*/ return bestB;

Fig. 3. Algorithm B&B: best-first, branch and bound

On line /*1*/ in the figure, we initialize the variable bestB to a finite binary
base of size $\lfloor \log_2(max(S)) \rfloor$ (same as in Figure 2) and initialize the queue to
contain the root of the search space (the empty base). As long as there are still
nodes to be expanded in the queue that are potentially interesting (line /*2*/),
we select (at line /*3*/) the best candidate base B from the frontier of the tree
in construction for further expansion. Now the search tree is expanded for each
of the relevant integers (calculated at line /*4*/). For each child newB of B (line
/*5*/), we check if pruning at newB should occur (line /*6*/) and if not we check
if a better bound has been found (line /*7*/). Finally, when the loop terminates,
we have found the optimal base and return it (line /*8*/).
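The best-first variant can be sketched in Python with the standard `heapq` module serving as the priority queue. As before this is an illustration for the sum_digits cost function with unrestricted integer extenders; the function names are hypothetical and the code is not the authors' implementation.

```python
import heapq
from math import prod

def digits(v, base):
    """Digits of v in a finite mixed radix base, least significant first."""
    ds = []
    for r in base:
        v, d = divmod(v, r)
        ds.append(d)
    ds.append(v)
    return ds

def sum_digits(S, base):
    return sum(sum(digits(v, base)) for v in S)

def cost_alpha(S, base):
    """Partial cost (all but the msd column) plus the heuristic of Lemma 14."""
    partial = sum(sum(digits(v, base)[:-1]) for v in S)
    return partial + sum(1 for x in S if x >= prod(base))

def find_base(S):
    m = max(S)
    best = (2,) * (m.bit_length() - 1)            # line 1: initial binary base
    best_cost = sum_digits(S, best)
    queue = [(cost_alpha(S, ()), ())]             # line 1: queue holds the root
    while queue and queue[0][0] < best_cost:      # line 2
        _, base = heapq.heappop(queue)            # line 3
        for p in range(2, m // prod(base) + 1):   # line 4: extenders
            new_b = base + (p,)                   # line 5
            alpha = cost_alpha(S, new_b)
            if alpha <= best_cost:                # line 6: otherwise prune
                heapq.heappush(queue, (alpha, new_b))
                c = sum_digits(S, new_b)          # line 7
                if c < best_cost:
                    best, best_cost = new_b, c
    return best, best_cost                        # line 8

best, best_cost = find_base([16, 30, 54, 60])
```

The loop stops as soon as the most promising queued base cannot beat the best base found so far, so nodes with a large under-approximated cost are never expanded.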



7 Optimal Base Search III: Search Modulo Product 

This section introduces an abstraction on the search space, classifying bases 
according to their product. Instead of maintaining (during the search) a priority 
queue of all bases (nodes) that still need to be explored, we maintain a special 
priority queue in which there will only ever be at most one base with the same 
product. So, the queue will never contain two different bases $B_1$ and $B_2$ such
that $\prod B_1 = \prod B_2$. In case a second base, with the same product as one already
in, is inserted to the queue, then only the base with the minimal value of $cost^{\alpha}_S$
is maintained on the queue. We call this type of priority queue a hashed priority
queue because it can conveniently be implemented as a hash table.

The intuition comes from a study of the sum_digits cost function for which
we can prove the following Property 1 on bases: Consider two bases $B_1$ and
$B_2$ such that $\prod B_1 = \prod B_2$ and such that $cost^{\alpha}_S(B_1) \leq cost^{\alpha}_S(B_2)$. Then for any
extension of $B_1$ and of $B_2$ by the same sequence $C$, $cost_S(B_1 C) \leq cost_S(B_2 C)$.
In particular, if one of $B_1$ or $B_2$ can be extended to an optimal base, then $B_1$
can. A direct implication is that when maintaining the frontier of the search
space as a priority queue, we only need one representative of the class of bases
which have the same product (the one with the minimal value of $cost^{\alpha}_S$).

A second Property 2 is more subtle and true for any cost function that has
the first property: Assume that in the algorithm described as Figure 3 we at
some stage remove a base $B_1$ from the priority queue. This implies that if in the
future we encounter any base $B_2$ such that $\prod B_1 = \prod B_2$, then we can be sure
that $cost_S(B_1) \leq cost_S(B_2)$ and immediately prune the search tree from $B_2$.

Our third and final algorithm, which we call hashB&B (best-first, branch
and bound, with hash priority queue) is identical to the algorithm presented in
Figure 3, except that the boxed priority queue introduced at line /*1*/ is
replaced by a hash_priority_queue.



The abstract data type hash_priority_queue maintains bases prioritized by
the value of $cost^{\alpha}_S$. Operations popMin() and peek() are as usual. Operation
push($B_1$) works as follows: (a) if there is no base $B_2$ in the queue such that
$\prod B_1 = \prod B_2$, then add $B_1$. Otherwise, (b) if $cost^{\alpha}_S(B_2) \leq cost^{\alpha}_S(B_1)$ then do
not add $B_1$. Otherwise, (c) remove $B_2$ from the queue and add $B_1$.
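One way to realise such a queue is sketched below. The design is an assumption made for illustration and is not taken from the paper: a binary heap (`heapq`) is combined with a dictionary keyed by the product of the base, and case (c) is handled by lazy deletion, i.e., the replaced entry stays in the heap and is skipped when it surfaces. Class and method names are hypothetical.

```python
import heapq
from math import prod

class HashPriorityQueue:
    """Priority queue holding at most one live base per product."""

    def __init__(self):
        self._heap = []   # (priority, base) entries, possibly stale
        self._live = {}   # product -> the live (priority, base) entry

    def push(self, priority, base):
        key = prod(base)
        old = self._live.get(key)
        if old is not None and old[0] <= priority:
            return False                     # case (b): keep the existing base
        self._live[key] = (priority, base)   # case (a), or case (c) replacing old
        heapq.heappush(self._heap, (priority, base))
        return True

    def _drop_stale(self):
        # Discard heap entries that are no longer the live entry of their product.
        while self._heap and self._live.get(prod(self._heap[0][1])) != self._heap[0]:
            heapq.heappop(self._heap)

    def peek(self):
        self._drop_stale()
        return self._heap[0] if self._heap else None

    def pop_min(self):
        self._drop_stale()
        entry = heapq.heappop(self._heap)
        del self._live[prod(entry[1])]
        return entry

    def __len__(self):
        return len(self._live)

q = HashPriorityQueue()
log = [q.push(8, (3, 5)), q.push(9, (5, 3)), q.push(7, (5, 3)), q.push(5, (2,))]
```

In this usage the second push is rejected because (3, 5) with the same product 15 already has a smaller priority, while the third push replaces it. The pruning described as Property 2 (remembering products that have already been popped) is not part of this sketch.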

Theorem 15.

(1) The sum_digits cost function satisfies Property 1; and (2) the hashB&B
algorithm finds an optimal base for any cost function which satisfies Property 1.

We conjecture that the other cost functions do not satisfy Property 1, and 
hence cannot guarantee that the hashB&B algorithm always finds an optimal base. 
However, in our extensive experimentation, all bases found (when searching for 
an optimal prime base) are indeed optimal. 

A direct implication of the above improvements is that we can now provide 
a tighter bound on the complexity of the search algorithm. Let us denote the 
number of different integers in $S$ by $s$ and $m = max(S)$. First note that in
the worst case the hashed priority queue will contain $m$ elements (one for each
possible value of a base product, which is never more than $m$). Assuming that
we use a Fibonacci Heap, we have a $O(\log(m))$ cost (amortized) per popMin()
operation and in total a $O(m \cdot \log(m))$ cost for popping elements off the queue
during the search for an optimal base.

Now focus on the cost of operations performed when extracting a base $B$
from the queue. Denoting $\prod B = q$, $B$ has at most $m/q$ children (integers which
extend it). For each child we have to calculate $cost_S$ which incurs a cost of $O(s)$
and possibly to insert it to the queue. Pushing an element onto a hashed priority
queue (in all three cases) is a constant time operation (amortized), and hence
the total cost for dealing with a child is $O(s)$.

Finally, consider the total number of children created during the search which
corresponds to the following sum:

$O\left(\sum_{q=1}^{m} m/q\right) = O\left(m \sum_{q=1}^{m} 1/q\right) = O(m \cdot \log(m))$

So, in total we get $O(m \cdot \log(m)) + O(m \cdot \log(m) \cdot s) \leq O(m^2 \cdot \log(m))$. When
we restrict the extenders to be prime numbers then we can further improve this
bound to $O(m^2 \cdot \log(\log(m)))$ by reasoning about the density of the primes. A
proof can be found in the appendix.

8 Experiments

Experiments are performed using an extension to MiniSat+ [8] where the only
change to the tool is to plug in our optimal base algorithm. The reader is invited
to experiment with the implementation via its web interface. All experiments
are performed on a Quad-Opteron 848 at 2.2 GHz, 16 GB RAM, running Linux. 
Our benchmark suite originates from 1945 Pseudo-Boolean Evaluation [10] 
instances from the years 2006-2009 containing a total of 74,442,661 individual 
Pseudo-Boolean constraints. After normalizing and removing constraints with 
{ 0, 1 } coefficients we are left with 115,891 different optimal base problems where 
the maximal coefficient is 2"^^ — 1. We then focus on 734 PB instances where at 
least one optimal base problem from the instance yields a base with an element 
that is non-prime or greater than 17. When solving PB instances, in all experi- 
ments, a 30 minute timeout is imposed as in the Pseudo-Boolean competitions. 
When solving an optimal base problem, a 10 minute timeout is applied. 

Experiment 1 (Impact of optimal bases): The first experiment illustrates the ad- 
vantage in searching for an optimal base for Pseudo-Boolean solving. We compare 
sizes and solving times when encoding w.r.t. the binary base vs. w.r.t. an optimal 
base (using the hashB&B algorithm with the num_comp cost function). Encoding
w.r.t. the binary base, we solve 435 PB instances (within the time limit) with an 
average time of 146 seconds and average CNF size of 1.2 million clauses. Using 
an optimal base, we solve 445 instances with an average time of 108 seconds, 
and average CNF size of 0.8 million clauses. 

Experiment 2 (Base search time): Here we focus on the search time for an
optimal base in six configurations using the sum_carry cost function. Configurations
M17, dfsHP17, and B&B17 are, respectively, the MiniSat+ implementation, our
dfsHP and our B&B algorithms, all three searching for an optimal base from
$Base_p^{17}$, i.e., with prime elements up to 17. Configurations hashB&B1,000,000,
hashB&B10,000, and hashB&B17 are our hashB&B algorithm searching for a base
from $Base_p^{\ell}$ with bounds of $\ell$ = 1,000,000, $\ell$ = 10,000, and $\ell$ = 17, respectively.

Results are summarized in Fig. 4 which is obtained as follows. We cluster
the optimal base problems according to the values $\lceil \log_{1.9745} M \rceil$ where $M$ is the
maximal coefficient in a problem. Then, for each cluster we take the average
runtime for the problems in the cluster. The value 1.9745 is chosen to minimize the
standard deviation from the averages (over all clusters). These are the points on
the graphs. Configuration M17 times out on 28 problems. For dfsHP17, the
maximal search time is 200 seconds. Configuration B&B17 times out for 1 problem.
The hashB&B configurations have maximal runtimes of 350 seconds, 14 seconds
and 0.16 seconds, respectively for the bounds 1,000,000, 10,000 and 17.

Fig. 4 shows that: (left) even with primes up to 1,000,000, hashB&B is faster
than the algorithm from MiniSat+ with the limit of 17; and (right) even with
primes up to 10,000, the search time using hashB&B is essentially negligible.



¹ http://aprove.informatik.rwth-aachen.de/forms/unified_form_PBB.asp
Fig. 4. Experiment 2: the 3 slowest configurations (left) (from back to front) M17 (blue),
hashB&B1,000,000 (orange) and dfsHP17 (yellow). The 4 fastest configurations (right)
(from back to front) dfsHP17 (yellow), B&B17 (green), hashB&B10,000 (brown) and
hashB&B17 (azure). Note the change of scale for the y-axis with 50k ms on the left
and 8k ms on the right. Configuration dfsHP17 (yellow) is lowest on left and highest
on right, setting the reference point to compare the two graphs.

Experiment 3 (Impact on PB solving): Fig. 5 illustrates the influence of improved 
base search on SAT solving for PB Constraints. Both graphs depict the number 
of instances solved (the x-axis) within a time limit (the y-axis). On the left, total 
solving time (with base search), and on the right, SAT solving time only. 



[Figure 5: two plots, each with the number of instances solved on the x-axis and
time on the y-axis. Curves: MiniSat+ 17, Binary base, hashB&B sum of digits 10,000,
hashB&B carry cost 10,000, hashB&B comp cost 10,000.]

Fig. 5. Experiment 3: total times (left), solving times (right) 

Both graphs consider the 734 instances of interest and compare SAT solving
times with bases found using five configurations. The first is MiniSat+ with
configuration M17, the second is with respect to the binary base, the third to
fifth are hashB&B searching for bases from $Base_p^{10,000}(S)$ with cost functions:
sum_digits, sum_carry, and num_comp, respectively. The average total/solve
run-times (in sec) are 150/140, 146/146, 122/121, 116/115 and 108/107 (left
to right). The total number of instances solved are 431, 435, 442, 442 and 445
(left to right). The average CNF sizes (in millions of clauses) for the entire test
set/the set where all algorithms solved/the set where no algorithm solved are
7.7/1.0/18, 9.5/1.2/23, 8.4/1.1/20, 7.2/0.8/17 and 7.2/0.8/17 (left to right).

The graphs of Fig. 5 and average solving times clearly show: (1) SAT solving 
time dominates base finding time, (2) MiniSat+ is outperformed by the trivial 
binary base, (3) total solving times with our algorithms are faster than with the 
binary base, and (4) the most specific cost function (comparator cost) outper- 
forms the other cost functions both in solving time and size. Finally, note that 
sum of digits with its nice mathematical properties, simplicity, and application 
independence solves as many instances as cost carry. 
Experiment 4 (Impact of high prime factors): This experiment is about the
effects of restricting the maximal prime value in a base (i.e. the value $\ell = 17$
of MiniSat+). An analysis of our benchmark suite indicates that coefficients
with small prime factors are overrepresented. To introduce instances where
coefficients have larger prime factors we select 43 instances from the suite and
multiply their coefficients to introduce the prime factor 31 raised to the power
$i \in \{0, \ldots, 5\}$. We also introduce a slack variable to avoid gcd-based
simplification. This gives us a collection of 258 new instances.
We used the B&B algorithm with the sum_carry cost function applying the limit
$\ell = 17$ (as in MiniSat+) and $\ell = 31$. Results indicate that for $\ell = 31$, both CNF
size and SAT-solving time are independent of the factor $31^i$ introduced for $i > 0$.
However, for $\ell = 17$, both measures increase as the power $i$ increases. Results on
CNF sizes are reported in Fig. 6 which plots for 4 different settings the number
of instances encoded (x-axis) within a CNF with that many clauses (y-axis).

Fig. 6. Experiment 4: Number (x-axis) of instances encoded within number of
clauses (y-axis) on 4 configurations. From top line to bottom: (yellow)
$\ell = 17$, $i = 5$, (red) $\ell = 17$, $i = 2$, (green) $\ell = 31$, $i = 5$, and
(blue) $\ell \in \{17, 31\}$, $i = 0$.

9 Related Work 

Recent work [2] encodes Pseudo-Boolean constraints via "totalizers" similar to 
sorting networks, determined by the representation of the coefficients in an un- 
derlying base. Here the authors choose the standard base 2 representation of 
numbers. It is straightforward to generalize their approach for an arbitrary mixed 
base, and our algorithm is directly applicable. In [11] the author considers the
sum_digits cost function and analyzes the size of representing the natural
numbers up to $n$ with (a particular class of) mixed radix bases. Our Lemma 6 may
lead to a contribution in that context.

10 Conclusion 

It has been recognized now for some years that decomposing the coefficients in 
a Pseudo-Boolean constraint with respect to a mixed radix base can lead to 
smaller SAT encodings. However, it remained an open problem to determine if 
it is feasible to find such an optimal base for constraints with large coefficients. 
Lacking a better solution, the implementation in the MiniSat+ tool applies a
brute force search considering prime base elements up to 17.






To close this open problem, we first formalize the optimal base problem and
then significantly improve the search algorithm currently applied in MiniSat+.
Our algorithm scales and easily finds optimal bases with elements up to 1,000,000.
We also illustrate that, for the measure of optimality applied in MiniSat+, one
must also consider non-prime base elements. However, choosing the simpler
sum_digits measure, it is sufficient to restrict the search to prime bases.

With the implementation of our search algorithm it is possible, for the first
time, to study the influence of basing SAT encodings on optimal bases. We show
that for a wide range of benchmarks, MiniSat+ does actually find an optimal
base consisting of elements less than 17. We also show that many Pseudo-Boolean
instances have optimal bases with larger elements and that this does influence
the subsequent CNF sizes and SAT solving times, especially when coefficients
contain larger prime factors.

Acknowledgement We thank Daniel Berend and Carmel Domshlak for useful 
discussions. 
A Appendix: Proofs

A.1 Proving Lemma 5



Lemma 5. Let $S \in ms(\mathbb{N})$ and consider the sum_digits cost function. Then, $S$
has an optimal base in $Base(S)$.

Proof. Let $S \in ms(\mathbb{N})$ and let $B \in Base$ be an optimal base for $S$ with
$\prod B > max(S)$. Let $B'$ be the base obtained by removing the last element
from $B$. We show that $msd(S_{(B)}) = \langle 0, \ldots, 0 \rangle$ and that $sum\_digits(S_{(B)}) =
sum\_digits(S_{(B')})$. The claim then follows. Assume falsely that for some $v \in S$,
$msd(v_{(B)}) > 0$. Then we get the contradiction

$$v = \sum_{i=0}^{|B|} v_{(B)}(i) \cdot weights(B)(i) \ge msd(v_{(B)}) \cdot \prod B > msd(v_{(B)}) \cdot max(S) \ge v.$$

From the definition of sum_digits and the above contradiction we deduce that
$sum\_digits(S_{(B)}) = sum\_digits(S_{(B')})$.

It is straightforward to generalize Lemma 5 for the other cost functions con- 
sidered in this paper. 



A.2 Proving Lemma 6

Proposition 16 (Unique base representation). Let $B = \langle r_0, \ldots, r_{k-1} \rangle$ be a
base with $weights(B) = \langle w_0, \ldots, w_k \rangle$ and let $v \in \mathbb{N}$. Then, the unique
representation of $v$ in $B$ is obtained as $v_{(B)} = \langle d_0, \ldots, d_k \rangle$ such that:
(1) $d_0 = mod(v, r_0)$, (2) $d_i = mod(div(v, w_i), r_i)$ for $0 < i < k$, and
(3) $d_k = div(v, w_k)$.



Proof. (sketch) We have to show that for $i < k$, $0 \le d_i < r_i$ and that
$v = \sum_{i=0}^{k} d_i \cdot w_i$. The first property follows directly from the
construction. The second is elaborated on below.
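
The construction in Proposition 16 can be sketched in a few lines of Python. This is only an illustration: the helper names `weights`, `to_base`, `from_base` and `sum_digits` are chosen here for readability and are not taken from the MiniSat+ extension. Digits are stored least significant first, and the example values are the set $S = \{16, 30, 54, 60\}$ and the base $\langle 3, 5, 2, 2 \rangle$ from the introduction.

```python
def weights(base):
    """Return [w_0, ..., w_k] with w_0 = 1 and w_{i+1} = w_i * r_i."""
    ws = [1]
    for r in base:
        ws.append(ws[-1] * r)
    return ws


def to_base(v, base):
    """Digits [d_0, ..., d_k] of v, least significant first, as in Proposition 16."""
    ws = weights(base)
    k = len(base)
    digits = [(v // ws[i]) % base[i] for i in range(k)]  # d_i = mod(div(v, w_i), r_i)
    digits.append(v // ws[k])                            # d_k = div(v, w_k)
    return digits


def from_base(digits, base):
    """Inverse direction: v = sum of d_i * w_i."""
    return sum(d * w for d, w in zip(digits, weights(base)))


def sum_digits(numbers, base):
    """Sum of all digits of all numbers when written in the given base."""
    return sum(sum(to_base(v, base)) for v in numbers)


if __name__ == "__main__":
    S = [16, 30, 54, 60]
    B = [3, 5, 2, 2]
    for v in S:
        print(v, to_base(v, B))
    print("sum of digits:", sum_digits(S, B))
```

For this example the digit sum is 9, in agreement with the value stated in the introduction, and `from_base(to_base(v, B), B)` returns `v` for every natural number `v`.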



Proposition 17 (Base factoring). Let $v \in \mathbb{N}$ and let $B_1 = \langle r_0, \ldots, r_{k-1} \rangle$ and
$B_2 = \langle r'_0, \ldots, r'_{k-2} \rangle$ be bases which are identical except that at some position
$0 \le p < k - 1$, two consecutive base elements in $B_1$ are replaced in $B_2$ by their
multiplication. Formally: $r'_i = r_i$ for $0 \le i < p$, $r'_p = r_p \cdot r_{p+1}$, and $r'_i = r_{i+1}$ for
$p < i < k - 1$. Then, the sum of the digits in $v_{(B_2)}$ is greater or equal to the sum
of the digits in $v_{(B_1)}$.

Proof. (sketch) We first observe that $v_{(B_2)}(i) = v_{(B_1)}(i)$ for $0 \le i < p$ and
$v_{(B_2)}(i) = v_{(B_1)}(i+1)$ for $p < i \le k - 1$. So, it remains to show that

$$\sum_{i=0}^{k-1} v_{(B_2)}(i) - \sum_{i=0}^{k} v_{(B_1)}(i) = v_{(B_2)}(p) - v_{(B_1)}(p) - v_{(B_1)}(p+1) \ge 0.$$

We elaborate on this below.
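
The effect of merging two consecutive base elements can be checked numerically. The sketch below is an illustration only; the helper names `digits` and `merge` are hypothetical. It computes digits by repeated `divmod`, which yields the same representation as Proposition 16, and verifies on a small range that the merged digit equals $v_{(B_1)}(p) + v_{(B_1)}(p+1) \cdot r_p$ and that the digit sum never decreases.

```python
def digits(v, base):
    """Mixed radix digits of v, least significant first, via repeated divmod."""
    out = []
    for r in base:
        v, d = divmod(v, r)
        out.append(d)
    out.append(v)  # most significant digit (unbounded)
    return out


def merge(base, p):
    """Replace the consecutive elements base[p], base[p + 1] by their product."""
    return base[:p] + [base[p] * base[p + 1]] + base[p + 2:]


def check_factoring(base, limit=500):
    """Check Proposition 17 for all merge positions and all v below limit."""
    for p in range(len(base) - 1):
        merged = merge(base, p)
        for v in range(limit):
            d1, d2 = digits(v, base), digits(v, merged)
            # The merged digit combines the two original digits.
            assert d2[p] == d1[p] + d1[p + 1] * base[p]
            # Merging never makes the digit sum smaller.
            assert sum(d2) >= sum(d1)
    return True


if __name__ == "__main__":
    print(check_factoring([3, 5, 2, 2]))
```

This is the reason why, for the sum_digits measure, splitting a non-prime base element into its factors can only help.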
Lemma 6. Let $S \in ms(\mathbb{N})$ and consider the sum_digits cost function. Then, $S$
has an optimal base in $Base_p(S)$.

Proof. Let $B_2$ be a base with $k - 1$ elements of the form

$$B_2 = \langle r_1, \ldots, r_{p-1}, (r_p \cdot r_{p+1}), r_{p+2}, \ldots, r_{k-1} \rangle$$

where the element $(r_p \cdot r_{p+1})$ at position $p$ is non-prime and $r_p, r_{p+1} > 1$. So,
taking $B_1 = \langle r_1, \ldots, r_{p-1}, r_p, r_{p+1}, r_{p+2}, \ldots, r_{k-1} \rangle$, we are in the setting of
Proposition 17. The result follows:

$$sum\_digits(S_{(B_2)}) - sum\_digits(S_{(B_1)}) = \sum_{v \in S} \sum_{i=0}^{k-1} v_{(B_2)}(i) - \sum_{v \in S} \sum_{i=0}^{k} v_{(B_1)}(i)$$
$$= \sum_{v \in S} \big( v_{(B_2)}(p) - v_{(B_1)}(p) - v_{(B_1)}(p+1) \big) \ge 0$$

A.3 Proving Lemma 14

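Consider the notation of Definition 8. Let $S \in ms(\mathbb{N})$, $B \in Base$ with $|B| = k$
and $S_{(B)} = (a_{ij})$. Denote the sequences $s(B) = \langle s_0^B, s_1^B, \ldots, s_k^B \rangle$ (sums) and
$c(B) = \langle c_0^B, c_1^B, \ldots, c_k^B \rangle$ (carries) defined by: $s_j^B = \sum_{i=1}^{n} a_{ij}$ for $0 \le j \le k$,
$c_0^B = 0$, and $c_{j+1}^B = (s_j^B + c_j^B) \; div \; B(j)$ for $0 \le j < k$. We denote also
$inputs_S^B(j) = s_{j-1}^B + c_{j-1}^B$ for $1 \le j \le k + 1$.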

Proposition 18. Let $S \in ms(\mathbb{N})$ and $B, B'$ bases such that $B' \succ B$. Then, for
$1 \le j \le |B|$, $inputs_S^B(j) = inputs_S^{B'}(j)$.

Proposition 19. Let $0 \ne v \in \mathbb{N}$, and $B$ a base. Then for every $0 \le i \le |B|$
such that $v \ge weights(B)(i)$ there exists $i \le j \le |B|$ such that $v_{(B)}(j) > 0$.

Proposition 20. Let $f : \mathbb{R} \to \mathbb{R}$ be any monotonically increasing function such
that for $x, y \in \mathbb{R}^+$, $f(x + y) \ge f(x) + y$. Let $S \in ms(\mathbb{N})$. Then
$(cost_S^f, dcost_S^f, h_S^f)$ given by:

1. $cost_S^f(B) = \sum_{j=1}^{|B|+1} f(inputs_S^B(j))$

2. $dcost_S^f(B) = \big(\sum_{j=1}^{|B|} f(inputs_S^B(j))\big) + f(div(inputs_S^B(|B|), B(|B| - 1)))$

3. $h_S^f(B) = |\{ x \in S \mid x \ge \prod B \}|$

is a heuristic cost function.

Proof. Let $f$ and $S$ be as defined above and $B, B'$ bases such that $B' \succ B$,
$|B| = k$, and $|B'| = k'$. For every base $B''$ we can see that $h_S(B'') \ge 0$ (the
size of a finite set is a natural number). Therefore it is sufficient to prove that
$cost_S(B') \ge dcost_S(B) + h_S(B)$. From Proposition 18 we get that
$inputs_S^{B'}(k) = inputs_S^B(k)$. From the definition of inputs and of $f$ we see that

$$f(inputs_S^{B'}(k+1)) = f\Big(\sum_{v \in S} v_{(B')}(k) + div(inputs_S^{B'}(k), B'(k-1))\Big)$$
$$= f\Big(\sum_{v \in S} v_{(B')}(k) + div(inputs_S^{B}(k), B(k-1))\Big)$$
$$\ge \sum_{v \in S} v_{(B')}(k) + f(div(inputs_S^{B}(k), B(k-1)))$$

For $k \le j \le k'$, denote $nz(j) = \{ v \in S \mid v_{(B')}(j) > 0 \}$.

Let $v \in \{ x \in S \mid x \ge \prod B = weights(B')(k) \}$. By Proposition 19 there exists
$k \le j \le k'$ such that $v_{(B')}(j) > 0$ and therefore $v \in nz(j)$. This implies that

$$|\{ x \in S \mid x \ge \prod B \}| + f(div(inputs_S^B(k), B(k-1)))$$
$$\le \sum_{j=k}^{k'} \sum_{v \in nz(j)} v_{(B')}(j) + f(div(inputs_S^B(k), B(k-1)))$$
$$= \sum_{j=k}^{k'} \sum_{v \in S} v_{(B')}(j) + f(div(inputs_S^B(k), B(k-1))) \le \sum_{j=k+1}^{k'+1} f(inputs_S^{B'}(j))$$

In total we have

$$cost_S(B') - dcost_S(B) - h_S(B) = \sum_{j=|B|+1}^{|B'|+1} f(inputs_S^{B'}(j)) - f(div(inputs_S^B(|B|), B(|B| - 1))) - |\{ x \in S \mid x \ge \prod B \}| \ge 0$$

The proof that $dcost_S(B') + h_S(B') \ge dcost_S(B) + h_S(B)$ is similar, noting that
$|nz(k')| = h_S(B')$.

Proof. (of Lemma 14)

If $f(x) = x$, then $cost_S^f$ = sum_carry. By Proposition 20 both definitions give
a heuristic cost function. The proof for the case of num_comp and sum_digits
is of the same structure as Proposition 20. The case of sum_carry is the most
complicated one.
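
To make the quantities in Proposition 20 concrete, here is a small sketch that computes the $inputs$ sequence and the resulting cost. It rests on an assumption about the definition: $inputs(1)$ is the sum of the least significant digits, and each later entry is the next digit column sum plus the carry $div(inputs(j-1), B(j-2))$, which is the recurrence used in the proof of Proposition 18. The function names are illustrative only. With $f(x) = x$ the cost corresponds to sum_carry as stated in the proof of Lemma 14.

```python
def digits(v, base):
    """Mixed radix digits of v, least significant first."""
    out = []
    for r in base:
        v, d = divmod(v, r)
        out.append(d)
    out.append(v)
    return out


def inputs(numbers, base):
    """Column sums plus incoming carries, one entry per digit position
    (assumed reading of the inputs sequence, see the lead-in)."""
    columns = [sum(col) for col in zip(*(digits(v, base) for v in numbers))]
    result, carry = [], 0
    for j, column_sum in enumerate(columns):
        total = column_sum + carry
        result.append(total)
        if j < len(base):
            carry = total // base[j]  # carry passed on to the next position
    return result


def cost(numbers, base, f=lambda x: x):
    """Sum of f over all inputs; f(x) = x gives the sum_carry reading."""
    return sum(f(x) for x in inputs(numbers, base))


if __name__ == "__main__":
    S = [16, 30, 54, 60]
    B = [3, 5, 2, 2]
    print(inputs(S, B), cost(S, B))
```

For the running example the digit columns sum to 1, 3, 2, 2, 1 and two carries of 1 are produced, so the inputs are 1, 3, 2, 3, 2 and the cost under this reading is 11.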

A.4 Proving Theorem 15

Definition 21. (Property 1) A heuristic cost function $(cost_S, dcost_S, h_S)$ is
called base mul equivalent if for every $S \in ms(\mathbb{N})$ and for every bases $B_1, B_2$
such that $\prod B_1 = \prod B_2$ and $cost^{\alpha}_S(B_1) \le cost^{\alpha}_S(B_2)$ the following holds:
1. for any extension of $B_1$ and of $B_2$ by the same base $C$, $cost_S(B_1 C) \le cost_S(B_2 C)$.

In the following propositions we refer to the hashB&B search algorithm of 
Section 7. 

Proposition 22. Let $S \in ms(\mathbb{N})$ and let $(cost_S, dcost_S, h_S)$ be a base mul
equivalent heuristic cost function. For every base $B_1$ extracted from the queue and for
every base $B_2$ such that $\prod B_2 = \prod B_1$, we have $cost^{\alpha}_S(B_1) \le cost^{\alpha}_S(B_2)$.

Proof. Let $\prod B_1 = k$. By complete induction on $k$.

Base: For $k = 1$ we have only the empty base and the claim is trivially true.

Step: Assume that the claim is true for every $i < k$ and assume that during
the run of the algorithm we extract a base $B_1$ from the queue with $\prod B_1 = k$.
Assume falsely that there exists a base $B_2$ such that $\prod B_2 = k$ and
$cost^{\alpha}_S(B_2) < cost^{\alpha}_S(B_1)$. This means that $B_2$ was not evaluated yet (otherwise
$B_1 = B_2$). Therefore the father of $B_2$ (in the tree of bases) was never extracted from
the queue. Let $A_2$ be the closest ancestor of $B_2$ that was extracted. Denote by $C_2$
the child of $A_2$ which is also the ancestor of $B_2$ (potentially $B_2$ itself). So, $C_2$
was evaluated. Observe that $A_2$ and $C_2$ are unique because the search space (of
bases) is a tree and $B_2 \succ C_2$. So, $cost^{\alpha}_S(B_1) \le cost^{\alpha}_S(C_2) \le cost^{\alpha}_S(B_2)$ and that
is a contradiction to the existence of $B_2$.



This proves that Property 1 implies Property 2.

Proposition 23. Let $v \in \mathbb{N}$ and $B, B'$ bases such that $B' \succ B$. Then, for
$0 \le j \le |B| - 1$, $v_{(B')}(j) = v_{(B)}(j)$.



Theorem 15. 

(1) The sum_digits cost function satisfies Property 1; and (2) the hashB&B
algorithm finds an optimal base for any cost function which satisfies Property 1.

Proof. 

1) Let $S \in ms(\mathbb{N})$, $B_1, B_2 \in Base$ and $\prod B_1 = \prod B_2 \in \mathbb{N}$. We prove that if
$cost^{\alpha}_S(B_1) \le cost^{\alpha}_S(B_2)$ then for any base extension $C$, $cost_S(B_1 C) \le cost_S(B_2 C)$.
The proof is by complete induction on $|C|$.

Base: For $|C| = 1$, first notice that $h_S(B_1) = h_S(B_2)$. This follows directly
from the definition of admissible heuristics (for the sum_digits case). Hence,
$dcost_S(B_1) \le dcost_S(B_2)$. From Propositions 16 and 23, we have that

$$cost_S(B_1 C) = \sum_{v \in S} \sum_{i=0}^{|B_1 C|} v_{(B_1 C)}(i)$$
$$= \sum_{v \in S} \sum_{i=0}^{|B_1| - 1} v_{(B_1)}(i) + \sum_{v \in S} \sum_{i=|B_1|}^{|B_1 C|} v_{(B_1 C)}(i)$$
$$= dcost_S(B_1) + \sum_{v \in S} \big( mod(div(v, \prod B_1), C(0)) + div(v, \prod B_1 C) \big)$$
$$\le dcost_S(B_2) + \sum_{v \in S} \big( mod(div(v, \prod B_2), C(0)) + div(v, \prod B_2 C) \big)$$
$$= \sum_{v \in S} \sum_{i=0}^{|B_2| - 1} v_{(B_2)}(i) + \sum_{v \in S} \sum_{i=|B_2|}^{|B_2 C|} v_{(B_2 C)}(i)$$
$$= \sum_{v \in S} \sum_{i=0}^{|B_2 C|} v_{(B_2 C)}(i) = cost_S(B_2 C)$$

Step: For $|C| = k > 1$, we define $C = C' \langle p \rangle$. So by the complete induction
assumption for $|C'| = k - 1$ we get that $cost_S(B_1 C') \le cost_S(B_2 C')$. By the fact
that $\prod B_1 C' = \prod B_2 C'$ we can deduce that $msd(S_{(B_1 C')}) = msd(S_{(B_2 C')})$. By
the definition of admissible heuristics for sum_digits:

$$dcost_S(B_1 C') + \sum msd(S_{(B_1 C')}) = cost_S(B_1 C') \le cost_S(B_2 C') = dcost_S(B_2 C') + \sum msd(S_{(B_2 C')})$$

Therefore, $dcost_S(B_1 C') \le dcost_S(B_2 C')$. Combining it with the fact that
$h_S(B_1 C') = h_S(B_2 C')$ we have that $cost^{\alpha}_S(B_1 C') \le cost^{\alpha}_S(B_2 C')$. Finally from
the inductive assumption we get that $cost_S(B_1 C' \langle p \rangle) \le cost_S(B_2 C' \langle p \rangle)$.

2) Let $(cost_S, dcost_S, h_S)$ be a base mul equivalent heuristic cost function and
$S \in ms(\mathbb{N})$. We denote by $bestB$ the best base found by the algorithm at
each point of the run. Let $B$ be the first base extracted from the queue such
that $cost^{\alpha}_S(B) \ge cost_S(bestB)$. This is the condition that terminates the main
loop of the algorithm, so we need to prove that $bestB$ is the optimal base
for $S$. Assume falsely that there exists an optimal base $B' = F.\langle b \rangle$ such that
$cost_S(B') < cost_S(bestB)$. Let $A$ be the nearest ancestor of $B'$ such that its
base equivalence class representative $R$ was extracted from the queue ($R \ne B'$,
otherwise $cost_S(bestB) \le cost_S(B')$). By Proposition 22 we know that
$cost^{\alpha}_S(R) \le cost^{\alpha}_S(A)$ and by Property 1 that for any base $C$, $cost_S(RC) \le cost_S(AC)$. In
particular for the case where $AC = A.\langle b \rangle C' = B'$. By choice of $A$ we get that
the equivalence class representative of $A.\langle b \rangle$ was not extracted (and it is the
same class of $R.\langle b \rangle$). Therefore, $cost_S(B') \ge cost_S(R.\langle b \rangle C') \ge cost^{\alpha}_S(R.\langle b \rangle) \ge
cost^{\alpha}_S(B)$, which is a contradiction.
A.5 On the complexity of the hashB&B algorithm for prime bases

Theorem 24. Let $S \in ms(\mathbb{N})$ with $m = max(S)$. Then, the complexity of the
hashB&B algorithm for prime bases is $O(m^2 \cdot \log(\log(m)))$.

Proof. We use the prime number theorem which states that the number of
primes up to $x$ is $O(x / \log(x))$. The number of prime bases evaluated in the worst
case scenario is:

$$O\Big(\sum_{i=1}^{m} \frac{m/i}{\log(m/i)}\Big) = O\Big(m \cdot \sum_{i=1}^{m} \frac{1}{i \cdot \log(m/i)}\Big)$$

But

$$\sum_{i=1}^{m} \frac{1}{i \cdot \log(m/i)} \le \sum_{k=1}^{\log(m)} \sum_{i=2^{k-1}}^{2^k} \frac{1}{i \cdot \log(m/i)}$$
$$\le \sum_{k=1}^{\log(m)} \sum_{i=2^{k-1}}^{2^k} \frac{1}{2^{k-1} \cdot (\log(m) - \log(i))}$$
$$\le \sum_{k=1}^{\log(m)} \frac{2^{k-1}}{2^{k-1} \cdot (\log(m) - k + 1)}$$
$$= \sum_{k=1}^{\log(m)} \frac{1}{\log(m) - k + 1}$$
$$= \sum_{k=1}^{\log(m)} \frac{1}{k} = O(\log(\log(m)))$$

And so the total number of bases evaluated during a run of the algorithm is
bounded by $O(m \cdot \log(\log(m)))$ and the overall complexity is $O(m^2 \cdot \log(\log(m)))$.
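
The gap between arbitrary and prime extenders can be observed directly. The following sketch is an illustration under the same simplifying assumption as the bound above: each product $q \le m$ is extracted once, and its children are the extensions by a prime $p$ with $q \cdot p \le m$, so their number is the prime count of $\lfloor m/q \rfloor$. The names `prime_counts`, `prime_children` and `all_children` are hypothetical.

```python
def prime_counts(m):
    """Return a list pi with pi[x] = number of primes <= x, for 0 <= x <= m."""
    is_prime = [False, False] + [True] * (m - 1)
    for p in range(2, int(m ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, m + 1, p):
                is_prime[multiple] = False
    pi, count = [], 0
    for flag in is_prime:
        count += flag
        pi.append(count)
    return pi


def prime_children(m):
    """Number of children when only prime extenders are allowed."""
    pi = prime_counts(m)
    return sum(pi[m // q] for q in range(1, m + 1))


def all_children(m):
    """Number of children when every integer extender is allowed."""
    return sum(m // q for q in range(1, m + 1))


if __name__ == "__main__":
    for m in (10, 1_000, 100_000):
        print(m, prime_children(m), all_children(m))
```

The prime-only count grows noticeably more slowly than the unrestricted count, in line with the $\log(\log(m))$ versus $\log(m)$ factors in the two bounds.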



A.6 Integer division and modulus

Let $a, b \in \mathbb{N}$ with $b > 0$. It is standard to define $div(a, b)$ and $mod(a, b)$ as natural
numbers such that $a = div(a, b) \cdot b + mod(a, b)$ and $0 \le mod(a, b) < b$. In our
proofs we note that $div(a, b)$ is the maximal number such that there exists $r \in \mathbb{N}$
with $a = div(a, b) \cdot b + r$.

Proposition 25. Let $a, b, c \in \mathbb{N}$ with $b, c > 0$. Then $div(a, b \cdot c) = div(div(a, b), c)$.

Proof. By definition, let $k = div(a, b \cdot c)$, $k' = div(a, b)$ and $k'' = div(k', c)$. Then,
$a = k \cdot c \cdot b + r$, $a = k' \cdot b + r'$ and $k' = k'' \cdot c + r''$. Now, $k \cdot c \le k'$ because
otherwise it would be a contradiction to the maximality of $k'$. If $k \cdot c = k'$
then $div(div(a, b), c) = div(k \cdot c, c) = k = div(a, b \cdot c)$. Assume that $k \cdot c < k'$.
Then $a = k' \cdot b + r' = k'' \cdot c \cdot b + r'' \cdot b + r'$ and from that we deduce $k'' \le k$
(otherwise it would be a contradiction to the maximality of $k$). On the other
hand, $0 < k' - k \cdot c = k'' \cdot c + r'' - k \cdot c \iff k \cdot c - r'' < k'' \cdot c$. From the definition of
modulus we get that $r'' < c$ and so $(k - 1) \cdot c < k \cdot c - r'' < k'' \cdot c$, and therefore
$k - 1 < k''$. In total we get that $k - 1 < k'' \le k$ and because $k$ and $k''$ are natural
numbers we get the equality.



Proposition 26. Let $a, b, c \in \mathbb{N}$ such that $b, c > 0$. Then, $mod(a, b \cdot c) =
mod(a, b) + mod(div(a, b), c) \cdot b$.

Proof. Let $r = mod(a, b \cdot c)$, $r' = mod(a, b)$, $k' = div(a, b)$, and $r'' = mod(k', c)$.
By definition we can see that $a = k \cdot b \cdot c + r$, $a = k' \cdot b + r'$, and $k' = k'' \cdot c + r''$.
Therefore $mod(a, b \cdot c) = mod(a, b) + mod(div(a, b), c) \cdot b \iff r = r' + r'' \cdot b \iff
a - k \cdot b \cdot c = a - k' \cdot b + (k' - k'' \cdot c) \cdot b \iff -k \cdot c = -k' + (k' - k'' \cdot c) \iff k \cdot c = k'' \cdot c
\iff k = k'' \iff div(a, b \cdot c) = div(k', c) \iff div(a, b \cdot c) = div(div(a, b), c)$ and this is
true by Proposition 25.
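
Both identities are easy to confirm by brute force over small natural numbers. The sketch below is an illustration rather than a proof; it uses Python's `//` and `%`, which coincide with $div$ and $mod$ on natural numbers, and the function name `check_div_mod` is chosen here.

```python
def check_div_mod(limit_a=200, limit_bc=12):
    """Exhaustively check Propositions 25 and 26 on a small range."""
    for a in range(limit_a):
        for b in range(1, limit_bc):
            for c in range(1, limit_bc):
                # Proposition 25: div(a, b*c) = div(div(a, b), c)
                assert a // (b * c) == (a // b) // c
                # Proposition 26: mod(a, b*c) = mod(a, b) + mod(div(a, b), c) * b
                assert a % (b * c) == a % b + ((a // b) % c) * b
    return True


if __name__ == "__main__":
    print(check_div_mod())
```

Note that the second summand in Proposition 26 is scaled by $b$: for $a = 5$, $b = 2$, $c = 3$ we get $mod(5, 6) = 5 = 1 + 2 \cdot 2$.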



A.7 The rest of the details

For the sake of readability, we write $\overline{B}$ for $weights(B)$.
Proof. (of Proposition 16, by induction on $|B|$).

Base (length is 1): $B = \langle b \rangle$ and hence $\overline{B} = \langle 1, b \rangle$, $v_{(B)}(0) = mod(v, b)$,
$v_{(B)}(1) = div(v, b)$ and by definition of div and mod we can see that
$v = div(v, b) \cdot b + mod(v, b)$.

Step: Assume the assumption is true for every base $B$ such that $|B| = k - 1$. Let
$|B| = k$ such that $B = \langle b_0, b_1, \ldots, b_{k-1} \rangle$. Define the base $B' = \langle b_0 b_1, b_2, \ldots, b_{k-1} \rangle$
with $|B'| = k - 1$. We can see that for $0 < i < k - 1$, $B'(i) = B(i + 1)$, and that
for $0 < i < k$, $\overline{B'}(i) = \overline{B}(i + 1)$. From this we can see that for $0 < i < k - 1$,
$v_{(B')}(i) = mod(div(v, \overline{B'}(i)), B'(i)) = mod(div(v, \overline{B}(i + 1)), B(i + 1)) = v_{(B)}(i + 1)$
and also that $v_{(B')}(k - 1) = div(v, \overline{B'}(k - 1)) = div(v, \overline{B}(k)) = v_{(B)}(k)$.
Back to the main claim: by the induction we know that $v = \sum_{i=0}^{k-1} v_{(B')}(i) \overline{B'}(i)$.
Therefore

$$\sum_{i=0}^{k} v_{(B)}(i) \overline{B}(i) = v_{(B)}(0) \overline{B}(0) + v_{(B)}(1) \overline{B}(1) + \sum_{i=2}^{k} v_{(B)}(i) \overline{B}(i)$$
$$= v_{(B)}(0) \overline{B}(0) + v_{(B)}(1) \overline{B}(1) + \sum_{i=1}^{k-1} v_{(B')}(i) \overline{B'}(i)$$
$$= v_{(B)}(0) \overline{B}(0) + v_{(B)}(1) \overline{B}(1) + v - v_{(B')}(0) \overline{B'}(0).$$

By Proposition 26 we can see that $0 = mod(v, b_0) + mod(div(v, b_0), b_1) \cdot b_0 -
mod(v, b_0 \cdot b_1) = v_{(B)}(0) \overline{B}(0) + v_{(B)}(1) \overline{B}(1) - v_{(B')}(0) \overline{B'}(0)$. And therefore we
get that $\sum_{i=0}^{k} v_{(B)}(i) \overline{B}(i) = v$.
Proof. (of Proposition 18)

First we notice that because $B' \succ B$ then for $0 \le i < |B|$ we get that $B(i) =
B'(i)$ and therefore $\overline{B}(i) = \overline{B'}(i)$. By Proposition 16 we see that $v_{(B)}(i) =
v_{(B')}(i)$. By definition of $S_{(B)}$, and by Proposition 16 and the definition of $inputs_S^B$,
we prove this proposition by induction on $1 \le i \le |B|$.

1. Base case: $inputs_S^B(1) = \sum_{v \in S} v_{(B)}(0) = \sum_{v \in S} v_{(B')}(0) = inputs_S^{B'}(1)$

2. Step: Assume that $inputs_S^B(i - 1) = inputs_S^{B'}(i - 1)$.
$inputs_S^B(i) = \sum_{v \in S} v_{(B)}(i - 1) + div(inputs_S^B(i - 1), B(i - 2)) =
\sum_{v \in S} v_{(B')}(i - 1) + div(inputs_S^{B'}(i - 1), B'(i - 2)) = inputs_S^{B'}(i)$

Proof. (of Proposition 19)

Let $v$ and $B$ be as defined above. Let $0 \le i \le |B|$ such that $v \ge \overline{B}(i)$. If there
exists an $i \le j \le |B|$ such that $v = \overline{B}(j)$ then we get that $div(v, \overline{B}(j)) = 1$
and in any case ($j < |B|$ or $j = |B|$) $v_{(B)}(j) = 1$. Otherwise let $i \le j \le |B|$
be the maximal index such that $v > \overline{B}(j)$. If $j = |B|$ then $div(v, \overline{B}(|B|)) > 0$.
Consider the case when $j < |B|$. Then $div(v, \overline{B}(j)) = k$ such that $k \cdot \overline{B}(j) \le
k \cdot \overline{B}(j) + r = v < \overline{B}(j) \cdot B(j)$ and so by dividing both sides by $\overline{B}(j)$ we get that
$div(v, \overline{B}(j)) < B(j)$, which means that $v_{(B)}(j) = mod(div(v, \overline{B}(j)), B(j)) > 0$.

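Proof. (of Proposition 17)

The following proof is deeply based on the definition of $v_{(B)}$ and Proposition 16.

Let $v$, $B_1$ and $B_2$ be as defined above.

1. For $0 \le i < p$ by definition $B_1(i) = B_2(i)$ which means that $\overline{B_1}(i) = \overline{B_2}(i)$
and therefore $v_{(B_1)}(i) = v_{(B_2)}(i)$. For $p < i < k - 1$ again we get that
$B_2(i) = B_1(i + 1)$, and for $p < i \le k - 1$ that $\overline{B_2}(i) = \overline{B_1}(i + 1)$, which again
means that $v_{(B_2)}(i) = v_{(B_1)}(i + 1)$.

2. We can see that $\overline{B_1}(p) = \overline{B_2}(p)$. Write $a = B_1(p)$, so that
$\overline{B_1}(p + 1) = \overline{B_1}(p) \cdot a$. By Proposition 16 we know that

$$0 = v - v = \sum_{i=0}^{k} v_{(B_1)}(i) \cdot \overline{B_1}(i) - \sum_{i=0}^{k-1} v_{(B_2)}(i) \cdot \overline{B_2}(i)$$
$$= v_{(B_1)}(p) \cdot \overline{B_1}(p) + v_{(B_1)}(p + 1) \cdot \overline{B_1}(p + 1) - v_{(B_2)}(p) \cdot \overline{B_2}(p)$$
$$= v_{(B_1)}(p) \cdot \overline{B_1}(p) + v_{(B_1)}(p + 1) \cdot \overline{B_1}(p) \cdot a - v_{(B_2)}(p) \cdot \overline{B_1}(p)$$

Therefore $v_{(B_2)}(p) = v_{(B_1)}(p) + v_{(B_1)}(p + 1) \cdot a$.
Because $a > 1$ we deduce that $v_{(B_2)}(p) \ge v_{(B_1)}(p) + v_{(B_1)}(p + 1)$.
And in total we get that
$\sum_{i=0}^{k-1} v_{(B_2)}(i) - \sum_{i=0}^{k} v_{(B_1)}(i) = v_{(B_2)}(p) - v_{(B_1)}(p) - v_{(B_1)}(p + 1) \ge 0$.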



Proof. (of Proposition 23)

Let $v \in \mathbb{N}$ and $B, B'$ bases such that $B' \succ B$. Let $0 \le i \le |B| - 1$ ($|B| > 0$,
otherwise there is no such index). For $i = 0$ by Proposition 16 we get that
$v_{(B')}(0) = mod(v, B'(0)) = mod(v, B(0)) = v_{(B)}(0)$. If $i > 0$ then by
Proposition 16 and the definition of $\overline{B}$ we can deduce $v_{(B')}(i) = mod(div(v, \overline{B'}(i)), B'(i)) =
mod(div(v, \overline{B}(i)), B(i)) = v_{(B)}(i)$.